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Abstract 



Numerical integrations in celestial mechanics often involve the repeated 
computation of a rotation with a constant angle. A direct evaluation of 
these rotations yields a linear drift of the distance to the origin. This is 
due to roundoff in the representation of the sine s and cosine c of the angle 
9. In a computer, one generally gets cP + s"^ ^ 1, resulting in a mapping 
that is slightly contracting or expanding. In the present paper wc present a 
method to find pairs of rcprcscntable real numbers s and c such that + 
is as close to 1 as possible. We show that this results in a drastic decrease 
of the systematic error, making it negligible compared to the random error 
of other operations. Wc also verify that this approach gives good results 
in a realistic celestial mechanics integration. 
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1 Introduction 



In some numerical computations, a rotation around a fixed axis by a constant 
angle 9 must be repeatedly applied. This occurs for instance in some long-term 
integrations in celestial mechanics where one must alternate between a fixed refer- 
ence frame - for integrating a Keplerian motion - and a rotating frame - to account 
for some rotating perturbing potential. A linear drift of the square distance to 
the axis is then generally observed |P, |10|, with the following properties: 

• The rate of drift, defined as the relative change of the square distance to 
the axis per rotation, is of the order of the roundoff error. For instance, if 
the computations are made in single precision, the relative change is of the 
order of 10"''. 

• For a given value of 9, the rate of drift is independent of the initial condi- 
tions. 

• The sign and the amplitude of the rate of drift seem to vary in quasi-random 
fashion with 9. 

These properties suggest a simple explanation for the drift. Let us call Z 
the rotation axis and (X, Y) the plane perpendicular to the Z-axis. Then Z is 
invariant in the rotation which is simply computed by 

(1) 





where ideally we should have 



cos 9, s = sin 9. (2) 



Actually, the values of c and s are rounded by the computer, and therefore + 
is not exactly 1. As a consequence, the mapping (|I]) is slightly contracting or 
expanding, in a systematic way since the same rounded values c and s are used 



for every iteration [10 



To illustrate, consider a computation in single precision, with roundoff errors 
of the order of 2~^^ (see below Sect.|^). We assume for simplicity that each step of 
the computation involves one rotation. Then after t steps, the cumulative error 
resulting from the systematic roundoff errors on c and s is eo ~ 2~^^t. This is 
shown by the dotted line in Fig. |l|. 

Other roundoff errors occur in the multiplications and additions involved in 
(|lD, and in other parts of the computation which have to be done at each step. 
However, these other errors are generally quasi-random, since different values of 
X, Y, and other variables are involved at each step. A reasonable conjecture is 
then that the cumulative effect has the nature of a random walk, and that the 



4 




Figure 1: Roundoff errors as a function of tlie number of steps, in single precision. 
Dotted line: eo = error due to the roundoff of cos 6' and sin 6', for an arbitrarily 
chosen 9. Dashed line: ei = error due to other roundoffs. Full line: €2 = errors 
due to the roundoff of cos 6* and sin 6', for a "good rotation" (Eq. (p!4D ). Dash-dot 
line: €3 = same for Eq. (|I5|) with k = 32. 
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error after t steps is of the order of ei ~ 2^^^v^. This is represented by the 
dashed hne in Fig. |l]. 

It can be seen that eo dominates. It is the cause of the observed linear drift. 

This drift can be a problem for long-term integrations. In the case of Fig. |l], 
for instance, it results in a complete breakdown of the computation after only 
2^^ ^ 10^ steps. It is therefore desirable to remove this drift or at least to decrease 
its rate. 

If there is some latitude in the choice of 6 (for example if it is determined 
by the choice of an integration step), then a natural idea is to select a value for 
which the roundoff error is very small. This is the topic of the present paper. 

M. Henon is responsible for the mathematical basis; J.-M. Petit, for the nu- 
merical simulations. 



2 Roundoff 

A real number is usually approximated on a computer by a representable number 
of the form 

(T X m X 2' (3) 

where a = ±1 is the sign, m is the mantissa and e is the exponent. 

In most cases, the number is normalized: the exponent is chosen in such a 
way that 1/2 < m < 1, i.e. the binary representation of m has the form 0.1 . . .. 
The 1 in the first position is then dropped and the next p — 1 binary digits are 
stored. Thus, m is of the form 

m = - -\ (4) 

where u is the stored integer, which lies in the range 

< < 2P-\ (5) 

Most computers today adhere to the IEEE754 standard |l|, ^ and use p = 24 
for single precision, p = 53 for double precision. 

We consider now the binary representation c of cos^. If |cos^^| = 1, it is 
exactly represented {e = 1, u = 0). If 1/2 < |c| < 1, the exponent is e = 0, and 
the representable values are 

|c| = 1/2 + 1^ (6) 

where u can take all values in the range (^). If 1/4 < |c| < 1/2, the exponent is 
e = — 1, and the representable values are 

M = l/4+^. (7) 
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To simplify the study, we consider only the subset of even values of z/, i.e. the 
values of c which are multiples of 2~^. Similarly, for 1/8 < |c| < 1/4, we consider 
only the representable values with v multiple of 4, and so on. In other words, in 
general we consider only the representable values of the form 

c = x2-f (8) 

where x is an integer satisfying 

< |x| < (9) 

Conversely, any such x corresponds to a value representable on the computer. 
What we have done here is simply to extract from the cumbersome variable-size 
lattice of representable points a subset of fixed size. In so doing, we eliminate 
some solutions of our problem; but, as will be seen, the number of remaining 
solutions is still large and should be sufficient for most applications. 

The same considerations apply to sin^, for which we consider only the repre- 
sentable values of the form 

s = y2-P (10) 

where y is an integer satisfying 

0<\y\<2P. (11) 



3 Some diophantine equations 

In this section we derive the equations satisfied by x and y for reasonable ampli- 
tudes of the roundoff error. 

1. We try first to find values of 6 for which there is no roundoff error, i.e. 
such that cos^ and sin^ are representable (in the restricted sense defined in 
the previous Section). We are thus led to seek the solutions of the diophantine 
equation 

x'' + y^ = 2'^ (12) 

where p is given, and x and y are unknown integers. 
Unfortunately, we have 

Theorem 1 The only solutions of ( |7^ are (x = ±2^, y = 0) and (x = 0, 

y = ±2P). 

We prove this recursively. If p = 0, the theorem is obviously true: + = 1, 
so = 1 and = 0, or conversely. Assume that the theorem has been proved 
for p — 1, with p > 0, and consider the value p. The right-hand side is even, and 
X and y are both even or both odd. If they are both odd, we have mod 4 = 1, 
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mod 4 = 1, while 2^^ mod 4 = 0: this is impossible. If x and y are both even, 
there is a solution x' = x/2, y' = y/2, for p' = p — 1. According to the theorem, 
this solution must be of the form {x' = ±2^~^, y' = 0) or {x' = 0, y' = ±2^~^); 
from which the theorem follows. 

These 4 solutions correspond to 6' = 0, 7r/2, tt, 37r/2, and are generally of no 
practical interest. 

2. The next best thing which we can try is to achieve an error 1. So we 
consider the diophantine equation 

X^+y^ = 2^P-l. (13) 

As above, p is given, and x and y are unknown integers. 
But there is 

Theorem 2 Eq. ( |73| j has no solutions for p > 0. 

Proof: mod 4 = or 1, mod 4 = or 1, while 2^^ — 1 mod 4 = 3: impossible. 

3. So we look now for solutions of 

x'' + y^ = + 1. (14) 

Fortunately, this equation always has solutions, and sometimes many of them 
(see Table 0). 

The roundoff error on + is now of the order of 2~^^ = 2~^^ only. The 
cumulative effect is €2 ~ 2"^*^^. This is represented by the full line in Fig. |l|. The 
situation is now inverted: the systematic error is negligible compared to the other 
errors for any realistic number of iterations. In fact both errors become of order 
unity after t = 2^^ ^ 3 x 10^^ steps. 

4. More solutions can be obtained (in order to have more choice for the value 
of 6), at the price of a larger roundoff error. We look then for solutions of 

X^+y^ = 2^P + k. (15) 

This is acceptable if k is not too large an integer. The systematic error after t 
steps becomes 63 ~ 2~^^kt. If we take for instance A; = 32 (see Sect. ^), then 
the error, represented by the dash-dot line in Fig. |l], is still quite acceptable; it 
becomes dominant only after t = 2^^ ^ 3 x 10^^ steps. 

We give now concrete recipes for the two cases of practical interest: single 
and double precision. 
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4 Single precision 



Because of elementary symmetries, it is clearly sufficient to consider the range 
< ^ < 7r/4. 

We use the IEEE754 standard value, p = 24. Eq. (jM]) has then only 4 solutions 
in the range < 6 < tt /A. Clearly this is insufficient for practical needs. So we 
enlarge our search and look for solutions of (|1^), with \k\ < /cmax- For instance 
for fcmax = 32, there are 54 solutions in the range < 9 < tt/4. These solutions 
are listed in Table |, sorted by increasing 6. 

This table is easily computed by scanning possible values of y, which are in 
the range < y < \J (2^^ + A;max)/2 = 11863283; this takes a few seconds on a 
workstation. (A minor technical problem is that the terms in ([ISD are too large 
for the standard integer format. This is solved by representing these terms as 
double precision numbers.) 

It can be seen that the values of 9 cover reasonably well the whole interval 
< < 7r/4. If more solutions are desired, at the expense of accepting larger 
roundoff errors, a larger table can easily be built. For instance if fcmax = 1000, 
the number of solutions increases to 869. 

A caveat is in order here: the value of 9 should never he directly used in the 
computation program. The values of 9 listed in Table |I| are not exact but rounded; 
they are given here only for illustration. Additional unwanted roundoff would 
occur in computing c and s from 6*, and the property (|T5|) would be destroyed in 
many cases. 

Instead, the values of c and s should receive independent names in the pro- 
gram, and should be computed directly from the exact values of x and y listed in 
Table |, using and (pUj). This computation should be done carefully, in such a 
way that no roundoff occurs. In Fortran, this can be done for instance with the 
instructions 

REAL*4 C, S 

C = 14842141. / 2. ** 24 
S = 7822137. / 2. ** 24 



5 Double precision 

For the IEEE754 standard value for double precision, p = 53, Eq. (|1^) has only 
8 solutions in the range < < 7r/4; so we must again turn to Eq. ([I5|) . 



Here it is not practical to tabulate solutions of (15) by scanning over y, as 
the range of possible values of y is of the order of 10^^. Instead we will use some 
classical results of the theory of numbers, which allow a systematic generation of 
the solutions. We review these results ffist. 
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X 


V 


B 


fc 


16777216 





0.0 





16777216 


1 


.00000006 


1 


16777216 


2 


.00000012 


4 


16777216 


3 


.00000018 


9 


16777216 


4 


.00000024 


16 


16777216 


5 


.00000030 


25 




8192 


.00048828 


4 


16776704 


131071 


.00781252 


1 


16761016 


737102 


.04394885 


4 


16760654 


745288 


.04443725 


4 


16756796 


827505 


.04934316 


-15 


16651675 


2048584 


.12241060 


25 


16566995 


2647575 


.15847021 


-6 


16564945 


2660371 


.15924263 


10 


16532686 


2853992 


.17094249 


4 


16423326 


3427731 


.20575745 


-19 


16389584 


3585598 


.21537962 


4 


16332540 


3837071 


.23074953 


-15 


16039629 


4919886 


.29762250 


-19 


15975413 


5124564 


.31040869 


9 


15751592 


5776013 


.35146882 


-23 


15554306 


6287968 


.38417252 


4 


15519136 


6374276 


.38972760 


16 


15486659 


6452780 


.39479142 


25 


15398649 


6660074 


.40821469 


21 


15359229 


6750486 


.41409362 


-19 


15263026 


6965272 


.42812149 


4 



X 


V 


e 


k 


15259624 




.42860965 


4 


15045797 


7422868 


.45831476 


-23 


14938544 


7636418 


.47255862 


4 


14842141 


7822137 


.48503090 


-6 


14803424 


7895164 


.48995756 


16 


14798175 


7904998 


.49062199 


-27 


14733952 


8024066 


.49868557 


4 


14604001 


8258216 


.51464749 


1 


14582860 


8295491 


.51720172 


25 


14539039 


8372056 


.52245995 


1 


14515158 


8413392 


.52530539 


-28 


14362958 


8670664 


.54312270 


4 


14188593 


8953145 


.56290949 


18 


14067585 


9142102 


.57628385 


-27 


13855696 


9460162 


.59906386 


4 


13851074 


9466928 


.59955226 


4 


13849473 


9469270 


.59972135 


-27 


13775055 


9577204 


.60753567 


-15 


13684899 


9705592 


.61688653 


9 


13421774 


10066328 


.64350099 


4 


13421771 


10066332 


.64350129 


9 


13416856 


10072882 


.64398939 


4 


13322259 


10197666 


.65332277 


-19 


13012020 


10590671 


.68316796 


-15 


12988527 


10619470 


.68538322 


-27 


12927484 


10693696 


.69111140 


16 


12058257 


11665051 


.76882501 


-6 



Table I: Solutions of ([T^) for p = 24, \k\ < 32. 
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5.1 Solutions of ^ = S: general properties 

Our problem is a particular case of a more general problem: find the solutions of 
the diophantine equation 

x^ + y^ = S. (16) 

S" is a given positive integer (we disregard the trivial case S = 0). This is a 
classical problem with a long history 0, Chap. VI]. 

It will be convenient here to revert to a consideration of the whole {x, y) 
plane. We call solution a pair of integers x and y satisfying (0). It will also be 
convenient to consider the (x, y) plane as the complex plane and to introduce the 
complex number 

z = x + iy = VSe'^. (17) 



( p!6[) can then be written 

zz = S. (18) 

Note that, for a given S, a solution can be specified simply by the value of 9. 

We call r{S) the number of solutions of (|I6|) . From any given solution one 
can deduce 3 other solutions (for the same S) by rotations of n/2, n, 37r/2. In 
complex notation: from any solution z we deduce 3 other solutions iz, i'^z, i^z. 
These 4 solutions are always distinct. We will call this a quadruplet of solutions. 

Therefore the number of solutions is a multiple of 4, and we write r{S) = 
4:h{S), where h{S) is the number of quadruplets. For instance: h{l) = 1, h{2) = 
1, h{3) = 0, h{A) = 1, h{5) =2, ... 

The total number of solutions up to a maximum, 

'5'max 

E (19) 

5=1 

is the number of points with integer coordinates inside or on the circle of ra- 



dius ■y/S'max it IS therefore of the order of tt 5'max |T2[' ^^"^ total number of 
quadruplets is J2h{S) ~ 7r5'max/4. From this we deduce that the average num- 
ber of quadruplets for a given S is {h{S)) = 7r/4. In practice, the solutions are 
unevenly distributed. For most values of 5", there are no solutions. The number 
of values S < S'max for which h{S) > is of the order of |T2 | 



0.76422— =i^. (20) 

V log -^max 

The probability that h{S) > for a given S is obtained by differentiating that 
expression: 

For values of interest here, S ~ 2^^^ ^ 10^°, this probability is about 0.09. 
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For any quadruplet generated by a solution 2;, there is a conjugate quadruplet 
of solutions generated by the conjugate value z. As is easily seen, there are 3 
cases: 

• z lies on one axis, i.e. y = ot x = 0; 6 mod 7r/2 = 0. In that case the 
quadruplet is identical with its conjugate; thus z generates only 4 distinct 
solutions. They correspond to ^ = 0, 7r/2, tt, 37r/2. S* is a square in that 
case. 

• z lies on a diagonal, i.e. |x| = \y\; 9 mod 7r/2 = 7r/4. In that case again the 
quadruplet is identical with its conjugate, and z generates only 4 distinct 
solutions. They correspond to 6* = 7r/4, 37r/4, 57r/4, 77r/4. 5* is a twice a 
square in that case. 

• z lies neither on one axis nor on a diagonal: 9 mod 7r/4 7^ 0. In that case 
the quadruplet and its conjugate are distinct, and z generates 8 distinct 
solutions. There is one of them in each of the 8 intervals j7r/4 < 9 < 
(j + l)7r/4,j = 0,l,...,7. 

5.2 Solutions for a given S 

The number of solutions for a given value of 5* can be determined as follows 0, p. 
242]. First we decompose S into prime factors. We distinguish 3 kinds of prime 
factors: 

• the factor 2, 

• factors fj equal to 1 (mod 4), 

• factors gj equal to 3 (mod 4), 
and we write the decomposition of S as 



We have then the following 

Theorem 3 // there exists an odd 7^, then h{S) = 0. // all 7^- are even, then 



S 



(22) 



j j 



h{S) = l[{f3, + l). 



(23) 



j 



Note that in the second case, h{S) is the number of divisors of Jlj fj\ i-e. the 
number of divisors of S made up of fj factors only. 
We determine now the solutions themselves. 
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1. We consider first the simple case where only one factor fj is present, and 
its exponent is f3j = 1; there are no factors 2 or gj. S is then a prime number 
equal to 1 (mod 4). According to the above theorem, in that case h{S) = 2 
pp. 219 and 241]: there are two quadruplets of solutions. 

The solutions do not lie either on an axis or on the first diagonal {S is not 
a square, nor twice a square). It follows that the two quadruplets are mutually 
conjugate. 

We call Zj = Xj + iyj the solution with 0<6'<7r/4(0<?/<x). An 
algorithm exists to compute that solution for any fj [|]. The solutions for the 
first few factors fj are given in Table |I[ The two quadruplets are generated by 
Zj and Zj. 



J 


/. 


Xj 


Vj 


1 


5 


2 


1 


2 


13 


3 


2 


3 


17 


4 


1 


4 


29 


5 


2 


5 


37 


6 


1 


6 


41 


5 


4 


7 


53 


7 


2 


8 


61 


6 


5 


9 


73 


8 


3 



Table II: Solutions in the case S = fj. 



2. We consider next the case where only a factor fj is present, but with an 
arbitrary exponent f]j. All quadruplets are then given by 

z = Zj'Zj^'-^' (24) 

where Xj can take the values 0, 1, . . . , jSj, and Zj is read from Table This 
produces the required number of quadruplets h{S) = [3j + 1. 

Example: S = 625 = 5^. Then Zi = 2 + i, and the solutions for z are: 
zf = -7 - 24i, zizf = 15 - 20i, z\z\ = 25, zfzi = 15 + 20z, zf = -7 + 2Ai. 

3. We consider now the case with more than one fj, but still no factors 2 or 
gj. All quadruplets are then given by 

z = X{z]'z^'~^' (25) 
i 

where Xj can take the values 0, 1, . . . , This produces a number of quadruplets 
h{S) = YijiPj + 1)) which is the required number. 
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Example: S = 1025 = 5^ x 41. There is: /i 
= 1, ^2 = 5 + 4i. (H) gives 



5, A = 2, = 2 + 2, /2 = 41, 



ZiZi 

\ 4 



Z2 
Z2 




(26) 



where one factor should be chosen in each column. This gives the 6 solutions 
-1 - 32i, 31 - 8i, 25 - 2Qi, 25 + 20i, 31 + Si, -1 + 32^, corresponding to 6 distinct 
quadruplets. The quadruplets are conjugate two by two; so there are only 3 
fundamentally different solutions. In the interval < < 7r/4, these solutions 
are, in terms of x and y: (32, 1), (31, 8), (25, 20). 

4. Finally, we consider the completely general case where the exponents a 
and I3j in (|2^) are arbitrary, and the 7^ are even but otherwise arbitrary. All 
quadruplets are then given by 



^3 7^P3~^3 

^3 



(27) 



5.3 Solutions for 5 = 2^^ + 1 

In the double precision case, comparatively large values oi \k\ can be accepted 
in (0); even with \k\ = 10^, for instance, the roundoff error at each step will be 
of the order of 10~^^ only. Thus, many more solutions can be generated than is 
needed for applications. We can therefore restrict our attention to some subset of 
solutions. We will consider values of k of the form k = 2^"^, with g > 0. Consider 
a solution {x,y) of ([T3|). Then x' = x/2'^, y' = y/2'^ verify 



+ y" = 2^" + 1 (28) 

with n = p — q. Thus, our choice of values of k is equivalent to considering values 
of S of the form S{n) = 2^" + 1, with n < p. 

These values have some nice properties. In particular, 

• All prime factors of S{n) are equal to 1 (mod 4). This is shown as follows: 
a prime factor d of 2^" + 1 must be odd. Since 2^" is a square, —1 is a 
quadratic residue (mod d). It follows that {d — l)/2 is even |TT| . 



• A prime factor of S{n) is also a prime factor of S{3n), S{5n), . . . This is 
obvious from the identity a^^'^^ + b'^^~^^ = (a + b){a'^^ — a?^~^h + a?^~'^h'^ — 
... + b^^, taking a = 2^", 6 = 1 and j = 1, 2, . . . 

As a result, the equation x'^+y'^ = S{n) = 2^*^+1 tends to have many solutions. 
Table |ITT| gives the number of quadruplets h{S) for n = 1 to 60. This number 
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X 
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22 
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8 
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23 


32 


9 


16 




24 


8 


10 


4 




25 


64 


11 


8 




26 


8 


12 


8 




27 


64 


13 


16 




28 


8 


14 


4 




29 


8 


15 


48 




30 


16 



Table III: Numb 
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46 
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u^ 
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u^ 
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O XZi 


37 


32 




52 


4 


38 


16 




53 


16 


39 


768 




54 


64 


40 


8 




55 


96 


41 


32 




56 


32 


42 


32 




57 


256 


43 


32 




58 


8 


44 


16 




59 


128 


45 


1536 




60 


64 



of quadruplets. 



was computed by factoring S into prime numbers (with the help of Maple) and 
using Eq. (H). 

For machines with p = 53, a particularly good value is n = 51, for which there 
are 9 prime factors: 

2i°2 ^ I ^ 1326700741 x 26317 x 13669 x 3061 x 953 x 409 x 137 x 13 x 5. (29) 

Thus the total number of quadruplets is 2^ = 512. They are given by the equation 

. _ / 2 - i W 3 - 2i W 11 - 4i W 20 - 3i W 28 - 13i \ 
^ + '^y - [ 2 + i j ( 3 + 2i j ( + j ( 20 + 3i J I 28 + 13i J 



(30) 



/ 55 - 6i W 113 - 30z W 154 - 51i \ / 30346 - 20145i \ 
55 + 6z J \ 113 + 30Z J \ 154 + 5n J \ 30346 + 20145z ) 

where one factor should be chosen inside each set of parentheses. 
The angle 6 is correspondingly given by 

with 6*1 = arctan 1/2, 62 = arctan2/3, . . . Approximate values of the 6j are listed 
in Table Here again, we point out that these values are given only to allow an 
estimate of 6 for a given combination; they should never be used in the program. 
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J 




1 


0.46364761 


2 


0.58800260 


3 


0.34877100 


4 


0.14888995 


5 


0.43467022 


6 


0.10866122 


7 


0.25950046 


8 


0.31980124 


9 


0.58604567 



Table IV: Values of Qj for n = 51. 



Instead, the exact values of x and y should be computed from (^) for the chosen 
combination, and then used to compute c and s as explained in Sect. ^. 

The total number of solutions is 2048, out of which 256 lie in the interval 
< < 7r/4. The values of Q cover the circle quite well: the maximal difference 
between two successive values is about 0.027. 

Another good value is n = 45; there is 

+ 1 = 29247661 x 54001 x 1321 x 181 x 109 x 61 x 41 x 37 x 13 x 5^ (32) 

and the total number of quadruplets is 2^ x 3 = 1536. This value of n should be ap- 
propriate in particular for machines with p = 48, like some CRAYs (C90/YMP). 
The quadruplets are given by the equation 



X + iy 





( 10 - 3i W 10 - 9i W 36 - 5i W 199 - 120i \ I 5331 - 910i 
10 + 3i + J 36 + 5i + 120i ) \ 5331 + 910i 

The angle 9 is correspondingly given by 



(33) 





(34) 



with 6i = arctan 1/2, 62 = arctan2/3, . . . Approximate values of the 6j are listed 
in Table 0. 

The total number of solutions is 6144, out of which 768 lie in the interval 
< 6 < TT /A. The values of 6 cover the circle quite well: the maximal difference 
between two successive values is about 0.005. 
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J 




T 

X 




2 


0.58800260 


3 


0.16514868 


4 


0.67474094 


5 


0.69473828 


6 


0.29145679 


7 


0.73281511 


8 


0.13800602 


9 


0.54263352 


10 


0.16907011 



Table V: Values of 9j for n = 45. 



Note that some computers, like the VAXs, use a double precision represen- 
tation with p = 56. However, the values of h{S) for n from 52 to 56 are rather 
small and the values of 6 cover the circle rather sparsely. 

Incidentally, the mathematical approach used in the present section could 
also be used in the case of single precision (Sect. §). But in that case a simple 
scanning method is more convenient. 

6 Numerical simulations 

We now present numerical verifications of these results. All rotations will be 
performed using the traditional mapping (p. We point out, however, that there 
exist other numerical implementations of rotations with good behaviour over a 
large number of iterations 0. 

6.1 Simple rotation 

All computations will be made in double precision on a Silicon Graphics Power In- 
digo 2 computer running a MIPS R8000 processor which conforms to the IEEE754 
standard for number representation. We first study the effect of a large number 
of iterations of the mapping (|I]). We compare random angles, some special angles 
found by chance to behave well, and the "good" angles found in the previous 
section. 

The quality of the computation is determined by the conservation of the radius 
R = VX^ + F2. Let Ro = ^/xf+Y^ be the radius of the initial point (Xq, Yq). 
As we iterate the mapping, we record the absolute value of the relative error 
\R^/R^ ~ 1|- Foi' each rotation angle 9, the relative error is averaged over 20 
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initial conditions chosen at random. The value obtained is representative of what 
really happens for all initial conditions, since the standard deviation remains very 
small. 

In a first series of runs, we scanned the range < 6' < 7r/2 with values of the 
form 9 = j/512, j = 1 to 802. These values being representable, we can reproduce 
the exact same value of 6 on any computer. Any other value of 6 stored in the 
computer would give an error in c and s of the same order of magnitude. However 
it would not be easy to kmow the exact value of 6 and reproduce the results 
on different computers. Typical results are shown in Fig. |^, solid lines. We 
observe a linear drift of the square radius as expected. However, some particular 
angles give somewhat better results (Fig. ^ja, dashed lines). For these angles, the 
roundoff error on + happens to be small and for up to 10^ iterations, the 
random errors due to other parts of the computation are dominant. Eventually, 
the linear drift emerges. The particularly good angles presented here correspond 
to j = 126, 248, 357, 423, and 700. 

In a second series of tests, we used values of the form 6 = j7r/2000, j = 1 
to 999. For most values of j, the results are similar to those of the previous 
case (Fig. ^ upper curves). However, we found 3 values of 9 (j = 40, 250, and 
450) with a peculiar behaviour, as shown in Fig. ^ (lower curves). For 10^ to 
10^ iterations, the square radius drifts linearly, and then it seems to lock on a 
particular value. 

Inspection of the numerical results shows that a periodic cycle of the mapping 
is reached. This is made possible by the low-order commensurability of 6 with 
2ti. Indeed 9 = 27r/100, 27r/16, and 9 x 27r/80 respectively and the observed 
periods are 100, 16, and 80. 

Finally we tested the "good" values of 9 defined by the solutions of Eq. 



with n = 45 and 51 (see Table [V^ ). These solutions are integers < 2"". Hence, 
they are representable as double precision float numbers on machines with 53 
bit mantissa. Similarly 2^^ and 2^^ are representable, being powers of 2. So 
assigning the solutions of Eq. (P^ ) to variables and dividing them by 2^^ or 2^^ 
give the desired representable number. We next iterate the mapping. In both 
cases, the random drift due to the other parts of the computation dominates over 
the linear drift for longer than the 10^ iterations performed here, and the overall 
error remains very small (Fig. |^c for n = 45 and Fig. |^d for n = 51). 



6.2 Integration in a rotating frame 

We came to consider this problem through the numerical study of the long term 
dynamics of Dactyl, Ida's satellite [§, |^. This required the use of a symplectic 
integrator in a rotating frame, thus involving a rotation. So we want to check the 
effect of combining the rotation with the iteration of the symplectic integrator of 
order 2 (SI2). 

The implementation of SI2 we use is the generalized leap-frog described by 
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Figure 2: Relative square radius errors (absolute value) as a function of the 
number of steps, (a): d = j/512, (b): d = j7r/2000, (c): solutions of equation (p^) 
with n = 45, (d): solutions of equation ( ]28|) with n = 51. 
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Yoshida 11131 . We write the Hamiltonian in the form 



n = HiiL, G, H) + 7^2(X, F, Z). 



(35) 



Here, Tii is the Hamiltonian of the two-body problem in a rotating frame, with 
a primary which has the same mass as the primary of the actual problem: 



L, G, and H being the Delaunay variables, uj the rotation speed of the rotating 
frame, and /i the product of the gravitational constant and the reduced mass of 
the two bodies. 7^2 represents the perturbation potential, namely the difference 
between the real potential and the point mass potential: 



To integrate from time t to time t + r, we integrate 7^2 for r/2, then Tii for r 
and finally 7^2 for r/2 again. 

The rotation occurs in the integration of Tii because of the term —uoH. Using 
the / and g Gauss functions , one can integrate the keplerian Hamiltonian in a 
fixed frame, — /i^/2L^, over any time interval r, directly in cartesian coordinates. 
We must then rotate the position and velocity vectors by an angle lot around the 
rotation axis. 

Symplectic integrators are known to behave correctly on the long run, i.e. 
they do not exhibit linear drifts in energy. But on a short time scale, they may 
have quite large oscillating errors. The amplitude of the oscillations are many 
orders of magnitude larger than the linear drift over a period. We average the 
energy error over a large number of iterations to see the secular error rise above 
the oscillating error. 

In Fig. ^, we present the evolution of the absolute value of the relative energy 
error for two different sets of angles. For each set, we either choose the time step 
and derive the rotation angle from the rotation speed, or we take a "good" angle 
of the same order of magnitude for n = 45 or n = 51. Fig. ^ shows the error over 
10^ iterations for an angle 6 = 0.0753 (solid line), 9 ~ 0.07511325 (dashed line), 
and 6 ~ 0.07194054 (dotted line) (see table |V11| , top lines). The energy error was 
averaged over 5 x 10^ iterations for each data point. For Fig. |]b, we integrated 
for only 10^ iterations but with an angle about 10 times smaller: 9 = 0.00753 
(solid line), 9 ^ 0.00772303 (dashed line), and 9 ~ 0.00781249 (dotted line) (see 
table |V11| , bottom lines). For this figure, the energy error was averaged over only 
10^ iterations. 

Clearly the use of solutions of Eq. (^) gives very good results. We do not 
see any linear drift in energy. However, this technique can be used only if we 
are free to choose the integration time step, as in the case of SI2. For example. 



7^l 




-iuH. 



(36) 



^2 = Up,rt{X, Y, Z) = U{X, Y, Z) + 



R 



(37) 
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n = 45 


X 


y 


e 


35004143579815 


3556679846300 


0.10125988 


34476730568729 


7021046116972 


0.20089880 


33597753939071 


10446576929072 


0.30145465 


30876883071208 


16868850912031 


0.50001828 


26872087044097 


22711912671104 


0.70169266 


n = 51 


,(■ 


y 


e 


2240341265158844 


226877536436263 


0.10092511 


2201219968984456 


474587240722913 


0.21235141 


2150106539295032 


669062232227809 


0.30167850 


1963938109574759 


1101612228814132 


0.51118844 


1721715036961844 


1451309661103513 


0.70038350 



Table VI: Values of x and |/, and corresponding 6', used in the numerical tests of 
the rotation. 



n 


X 


y 


e 


45 


35085163629799 


2640328077268 


0.07511325 


51 


2245975296866668 


161856006306841 


0.07194054 


45 


35183322803560 


271727410975 


0.00772303 


51 


2251731094732799 


17591984718848 


0.00781249 



Table VII: Values of n, x and y, and corresponding ^, used in the numerical tests 
of the symplectic integrator. 
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Number of Iterations 




Number of Iterations 



Figure 3: Relative energy errors (absolute value) as a function of the number 
of steps, (a): "normal" angle 6 = 0.0753 (solid line) and "good" angles of 
approximately the same amplitude for n — 45 (dashed line) and n — 51 (dotted 
hne). (b): same as (a), but for an normal angle of 0.00753, and corresponding 
good angles. 
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symplectic integrators of order 4 (SI4) or 6 as described in []T3| require the use of 
different time steps in a very precisely given relation: integrating over r with SI4 
corresponds to using SI2 with time step t/{2-2^/^), then -2^/^/(2-2^/^), and 
finally r/(2 — 2^/^) again. This cannot be achieved with solutions of Eq. (pSf ). 
For such completely different implementation of the rotation can be used, 

which yields good results Pj. 
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